{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Using a Property Graph Store \n",
    "\n",
    "Normally in LlamaIndex, you'd create a `PropertyGraphStore`, pass it into a `PropertyGraphIndex`, and it would get used automatically for inserting and querying.\n",
    "\n",
    "However, there are times when you would want to use the graph store directly. Maybe you want to create the graph yourself and hand it to a retriever or index. Maybe you want to write your own code to manage and query a graph store.\n",
    "\n",
    "This notebook walks through populating and querying a graph store without ever using an index. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup\n",
    "\n",
    "Here, we will use Neo4j as our property graph store.\n",
    "\n",
    "To launch Neo4j locally, first ensure you have Docker installed. Then launch the database with the following command:\n",
    "\n",
    "```bash\n",
    "docker run \\\n",
    "    -p 7474:7474 -p 7687:7687 \\\n",
    "    -v $PWD/data:/data -v $PWD/plugins:/plugins \\\n",
    "    --name neo4j-apoc \\\n",
    "    -e NEO4J_apoc_export_file_enabled=true \\\n",
    "    -e NEO4J_apoc_import_file_enabled=true \\\n",
    "    -e NEO4J_apoc_import_file_use__neo4j__config=true \\\n",
    "    -e NEO4JLABS_PLUGINS=\\[\\\"apoc\\\"\\] \\\n",
    "    neo4j:latest\n",
    "```\n",
    "\n",
    "From here, you can open the database at [http://localhost:7474/](http://localhost:7474/). On this page, you will be asked to sign in. Use the default username and password, both `neo4j`.\n",
    "\n",
    "Once you log in for the first time, you will be asked to change the password. The connection code below assumes the new password is `llamaindex`.\n",
    "\n",
    "After this, you are ready to create your first property graph!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.graph_stores.neo4j import Neo4jPropertyGraphStore\n",
    "\n",
    "pg_store = Neo4jPropertyGraphStore(\n",
    "    username=\"neo4j\",\n",
    "    password=\"llamaindex\",\n",
    "    url=\"bolt://localhost:7687\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Inserting\n",
    "\n",
    "Now that we have the store initialized, we can put some things in it!\n",
    "\n",
    "Inserting into a property graph store consits of inserting nodes:\n",
    "- `EntityNode` - containing some labeled person, place, or thing\n",
    "- `ChunkNode` - containing some source text that an entity or relation came from\n",
    "\n",
    "And inserting `Relation`s (i.e. linking multiple nodes)."
   ]
  },
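  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The node and relation types listed above are plain data objects, so they can be built and inspected before any database is involved. The following is a minimal sketch, assuming `ChunkNode` accepts `text` and an optional `id_`, and that `EntityNode.id` is derived from `name`, as in recent `llama-index-core` releases; check your installed version. The `chunk-1` ID is a made-up value for illustration.\n",
    "\n",
    "```python\n",
    "from llama_index.core.graph_stores.types import ChunkNode, EntityNode, Relation\n",
    "\n",
    "# An entity node: a labeled person, place, or thing.\n",
    "person = EntityNode(label=\"PERSON\", name=\"Logan\")\n",
    "\n",
    "# A chunk node: the source text an entity came from.\n",
    "# `id_` is set explicitly here (hypothetical value) so it is easy to reference.\n",
    "chunk = ChunkNode(text=\"Logan works for LlamaIndex.\", id_=\"chunk-1\")\n",
    "\n",
    "# A relation links one source node ID to one target node ID.\n",
    "mention = Relation(label=\"MENTIONS\", source_id=chunk.id, target_id=person.id)\n",
    "\n",
    "print(person.id, chunk.id, mention.label)\n",
    "```\n",
    "\n",
    "Nothing here touches the graph store; the objects only reach the database once they are passed to an upsert call."
   ]
  },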
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.graph_stores.types import EntityNode, ChunkNode, Relation\n",
    "\n",
    "# Create two entity nodes\n",
    "entity1 = EntityNode(label=\"PERSON\", name=\"Logan\", properties={\"age\": 28})\n",
    "entity2 = EntityNode(label=\"ORGANIZATION\", name=\"LlamaIndex\")\n",
    "\n",
    "# Create a relation\n",
    "relation = Relation(\n",
    "    label=\"WORKS_FOR\",\n",
    "    source_id=entity1.id,\n",
    "    target_id=entity2.id,\n",
    "    properties={\"since\": 2023},\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With some entities and relations defined, we can insert them!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pg_store.upsert_nodes([entity1, entity2])\n",
    "pg_store.upsert_relations([relation])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we wanted, we could also define the text chunk that these came from and link it to both entities:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.schema import TextNode\n",
    "\n",
    "source_node = TextNode(text=\"Logan (age 28), works for LlamaIndex since 2023.\")\n",
    "relations = [\n",
    "    Relation(\n",
    "        label=\"MENTIONS\",\n",
    "        target_id=entity1.id,\n",
    "        source_id=source_node.node_id,\n",
    "    ),\n",
    "    Relation(\n",
    "        label=\"MENTIONS\",\n",
    "        target_id=entity2.id,\n",
    "        source_id=source_node.node_id,\n",
    "    ),\n",
    "]\n",
    "\n",
    "pg_store.upsert_llama_nodes([source_node])\n",
    "pg_store.upsert_relations(relations)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, your graph should have 3 nodes and 3 relations.\n",
    "\n",
    "![low level graph](./low_level_graph.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Retrieving\n",
    "\n",
    "Now that our graph is populated with some nodes and relations, we can access some of the retrieval functions!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[EntityNode(label='PERSON', embedding=None, properties={'age': 28, 'name': 'Logan'}, name='Logan')]\n"
     ]
    }
   ],
   "source": [
    "# get a node\n",
    "kg_nodes = pg_store.get(ids=[entity1.id])\n",
    "print(kg_nodes)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[EntityNode(label='PERSON', embedding=None, properties={'age': 28, 'name': 'Logan'}, name='Logan')]\n"
     ]
    }
   ],
   "source": [
    "# get using properties\n",
    "kg_nodes = pg_store.get(properties={\"age\": 28})\n",
    "print(kg_nodes)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Logan -> WORKS_FOR -> LlamaIndex\n"
     ]
    }
   ],
   "source": [
    "# get paths from a node\n",
    "paths = pg_store.get_rel_map(kg_nodes, depth=1)\n",
    "for path in paths:\n",
    "    print(f\"{path[0].id} -> {path[1].id} -> {path[2].id}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[{'n': {'name': 'Logan', 'id': 'Logan', 'age': 28}}, {'n': {'name': 'LlamaIndex', 'id': 'LlamaIndex'}}]\n"
     ]
    }
   ],
   "source": [
    "# Run a cypher query (this will get all entity nodes)\n",
    "query = \"match (n:`__Entity__`) return n\"\n",
    "result = pg_store.structured_query(query)\n",
    "print(result)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Logan (age 28), works for LlamaIndex since 2023.\n"
     ]
    }
   ],
   "source": [
    "# get the original text node back\n",
    "llama_nodes = pg_store.get_llama_nodes([source_node.node_id])\n",
    "print(llama_nodes[0].text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Upserting\n",
    "\n",
    "You may have noticed that all the insert operations are actually upserts! As long as the ID of the node is the same, we can avoid duplicating data.\n",
    "\n",
    "Let's update a node."
   ]
  },
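  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see why a second `EntityNode` can update an existing one, it helps to compare IDs directly. This is a small sketch, assuming (as in recent `llama-index-core` releases) that `EntityNode.id` is computed from `name` rather than from the properties; verify this against your installed version.\n",
    "\n",
    "```python\n",
    "from llama_index.core.graph_stores.types import EntityNode\n",
    "\n",
    "original = EntityNode(label=\"PERSON\", name=\"Logan\", properties={\"age\": 28})\n",
    "updated = EntityNode(\n",
    "    label=\"PERSON\", name=\"Logan\", properties={\"age\": 28, \"location\": \"Canada\"}\n",
    ")\n",
    "\n",
    "# Same name, so the same ID, even though the properties differ.\n",
    "# Upserting `updated` therefore targets the node `original` created.\n",
    "print(original.id == updated.id)\n",
    "```\n",
    "\n",
    "Under that assumption, changing `name` would produce a different ID and therefore a new node instead of an update."
   ]
  },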
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "new_node = EntityNode(\n",
    "    label=\"PERSON\", name=\"Logan\", properties={\"age\": 28, \"location\": \"Canada\"}\n",
    ")\n",
    "pg_store.upsert_nodes([new_node])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[EntityNode(label='PERSON', embedding=None, properties={'location': 'Canada', 'age': 28, 'name': 'Logan'}, name='Logan')]\n"
     ]
    }
   ],
   "source": [
    "nodes = pg_store.get(properties={\"age\": 28})\n",
    "print(nodes)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Deleting\n",
    "\n",
    "Deletion works similar to `get()`, with both IDs and properties.\n",
    "\n",
    "Let's clean-up our graph for a fresh start."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# delete our entities\n",
    "pg_store.delete(ids=[entity1.id, entity2.id])\n",
    "\n",
    "# delete our text node by ID (passed by keyword, like the call above)\n",
    "pg_store.delete(ids=[source_node.node_id])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[]\n"
     ]
    }
   ],
   "source": [
    "nodes = pg_store.get(ids=[entity1.id, entity2.id])\n",
    "print(nodes)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "llama-index-bXUwlEfH-py3.11",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
